Genetic associations with ratios between protein levels detect new pQTLs and reveal protein-protein interactions

Summary Protein quantitative trait loci (pQTLs) are an invaluable source of information for drug target development because they provide genetic evidence to support protein function, suggest relationships between cis- and trans-associated proteins, and link proteins to disease endpoints. Using Olink proteomics data for 1,463 proteins measured in over 54,000 samples of the UK Biobank, we identified 4,248 associations with 2,821 ratios between protein levels (rQTLs). rQTLs were 7.6-fold enriched in known protein-protein interactions, suggesting that their ratios reflect biological links between the implicated proteins. Conducting a GWAS on ratios increased the number of discovered genetic signals by 24.7%. The approach can identify novel loci of clinical relevance, support causal gene identification, and reveal complex networks of interacting proteins. Taken together, our study adds significant value to the genetic insights that can be derived from the UKB proteomics data and motivates the wider use of ratios in large-scale GWAS.


In brief
Protein quantitative trait loci (pQTLs) are a valuable resource for drug target development.Here, Suhre shows that a systematic analysis of ratios between protein pairs strengthens associations at known pQTL loci by several hundred orders of magnitude (p-gain), suggesting that these ratios reflect biological links between the implicated proteins.

INTRODUCTION
Large-scale studies of the blood circulating proteome leverage the natural variation in the general population to identify genetic and nongenetic factors that control blood protein levels. 1 Of particular interest for drug development are genome-wide association studies (GWASs) that identify protein quantitative trait loci (pQTLs), because they provide genetic evidence for a causal effect of the underlying variant, and hence the affected gene(s), on the levels of the associated protein(s) and their physiological effects.In cases of cis-pQTLs, in which the genetic variant is located in proximity of the gene coding for the associated pQTL protein, the effect is most likely through a causal variant that modifies transcription, translation, or stability of the cis-encoded protein.More complex, but also more rewarding in terms of potential biological insights, are trans-pQTLs because they suggest direct or indirect protein-protein interactions between the presumably causal cis-encoded protein and the associated trans-protein, which can extend into larger networks when multiple proteins are associated with the same variant and ideally also clinical endpoints of interest.
Such genetics-driven insights are of the highest value to pharmaceutical companies because they can inform drug target discovery and validation, generate hypotheses on modes of action, and suggest biomarkers for target engagement and efficacy.Early successes of pQTL studies [2][3][4][5] led to the creation of the UK Biobank Pharma Proteomics Project (UKB PPP) consortium, a precompetitive consortium of 13 biopharmaceutical com-panies that funded the measurement of over 54,000 UKB samples on the Olink Explore 1536 affinity proteomics platform.Olink uses a dual antibody binding technique, called the proximity extension assay, to quantify the abundance of almost 1,500 blood circulating proteins (Table S1).The UKB PPP consortium recently published initial results from a GWAS that identified over 10,000 pQTLs using this platform. 6The Olink proteomics data were released in April 2023 to the public and can be accessed and analyzed using the DNAnexus UKB Research Analysis Platform (RAP) platform (ukbiobank.dnanexus.com).The aim in this paper was to explore new methods to enhance pQTL discovery and interpretation using this exceptional and freely available dataset.
My group and others previously developed analysis strategies for GWAS with metabolomics data, 7,8 a field that is similar in many ways to that of pQTL studies.In particular, we showed that partial correlations between metabolites can reconstruct metabolic networks 9,10 and that the hypothesis-free testing of all of the ratios between metabolites can substantially strengthen the association signals, in several cases elevating genetic loci out of the background noise. 11,12Both approaches are related in that they identify biological relationships between individual molecules through their shared genetic and nongenetic variance, which can then be integrated into larger metabolic networks, such as the atlas of genetic influences of the human metabolome 13 and more recent versions thereof. 14Previous GWAS with proteomics suggest that Gaussian graphical models (GGMs) built from partial correlations and ratios between protein levels can reveal biologically relevant protein-protein interactions, 5 but the approach has never been tested at scale.This article hypothesizes that a GWAS with ratios between protein levels can identify associations and novel links between protein pairs that have not been identified using current GWAS approaches.However, the computational costs of conducting a full-fledged all-against-all ratio GWAS are prohibitive at this point, estimated to several hundred thousand pounds Sterling on the DNAnexus AWS-based platform for a single run of a full-fledged all-against-all ratio GWAS, not considering costs associated with the handling of the generated data.This challenge will be aggravated in the future by the expected increases in proteome coverage.
This article, therefore, takes a more economic approach and tests genetic associations with ratios between proteins that are partially correlated and therefore more likely to be related through some biological process. 15For each pQTL reported by the UKB PPP consortium that implicated one of two partially correlated proteins, the ratio between the levels of these two proteins is tested for association with the pQTL variant.Then, a GWAS is conducted on those ratios that increased the strength of association at an already known pQTL locus (see the flowchart of this study in Figure S1).It will be shown in the following that by using this approach pQTLs could be identified that were not discovered by the standard GWAS with protein levels conducted by the UKB PPP consortium, 6 and furthermore, that genetic associations with ratios can uncover biologically relevant links between two or more proteins based on their shared genetic and nongenetic variance.Selected cases of biomedical interest are discussed and an interpretation of why I believe ratios work is provided.

Identification of ratio QTLs (rQTLs) at established pQTL loci
The increase in the strength of an association with ratios is quantified by the p-gain, which is defined as the smaller of the two p values for the single-protein associations divided by that for the ratio association. 11A p-gain of 10 is the equivalent of a nominal p value for a single test; in other words, a p-gain of 10 is expected to be observed by chance in 5% of the cases when ratios between two random proteins are tested.The following requires Bonferroni levels of significance for p values and p-gains throughout and refers to protein ratio associations with significant p-gains as rQTLs.
Quality control (QC) of the Olink data was performed by the UKB PPP before the data was shared with the user.These QC steps include outlier removal and removal of samples of low quality.The number of data points that passed QC was above 97.6% for all Olink panels (see method details).We split the UKB cohort into a discovery set comprising 43,000 individuals and a replication set of 8,700 individuals based on the data field ''genetic ethnic grouping'' being equal/not equal to ''Caucasian'' (see UKB documentation on data field 22006) and further limit the analysis to samples collected at baseline.
A selection of pharmaceutically relevant rQTLs is provided in Table 1, including associations with a ratio between a cis-and a trans-located protein, where the cis-protein is the target of an approved drug (Tclin according to Pharos 16 ).

Identification of rQTLs in a GWAS with ratios
A GWAS was conducted on the 2,821 ratios using the genotyped UKB data.For each ratio, I retained the strongest associations that reached a Bonferroni level of significance of p value <5 3 10 À8 /2,821, a p-gain >10 7 3 2,821, and that were more distant than 1 million bp from any other significant association with the same ratio.I identified 8,462 ratio-variant pairs with 2,095 unique variants that satisfied this criterion, which corresponds to a discovery per tested ratio of on average three independent GWAS signals with a significant p-gain (Figures 2 and S2; Table S6).The ratios with the largest number of rQTLs discovered were heparin-binding EGF-like growth factor (HBEGF) divided by platelet-derived growth factor subunit A (PDGFA) (N = 25) and integrin subunit b1 binding protein 2 (ITGB1BP2) divided by microtubule interacting and transport, domain containing 1 (MITD1) (N = 24).A total of 999 proteins were implicated in at least one rQTL, with a median of eight rQTLs per protein.from any pQTL reported by the UKB PPP GWAS for one of the two proteins in the respective ratio and thus represent previously nonreported pQTLs, which corresponds to an increase of 24.7% in genetic signals derived from the UKB PPP Olink data using ratios compared to the standard approach.
To investigate whether these rQTLs provided new insights of biomedical interest, I annotated the 2,095 rQTL variants identified in this study and the 5,717 pQTL variants reported by the UKB PPP GWAS using PhenoScanner 17 for association with 446 distinct GWAS traits (Tables S6 and S3, respectively).I identified 322 rQTL variants that were more distant than 10 6 bp from any pQTL variant on the same GWAS trait, implicating 874 rQTLs in a total of 4,700 coassociations with GWAS traits (Table S7).
These rQTLs provide new evidence to support drug target selection.For instance, rs3764640 associated with the ratio STK11/USP8 (Àlog 10 (p) = 13.8, log 10 (p-gain) = 10.5).The variant is an intragenic SNP in the STK11 gene and associated with the presence versus absence of psychosis in Alzheimer disease (AD) cases. 18STK11 is a serine/threonine-protein kinase, and USP8 may play a role in the degradation of activated protein kinases by ubiquitination, 19 which would explain the significant p-gain for the ratio.This rQTL hence not only supports a role of STK11 in AD pathology but it also provides further insights into the putative underlying biological pathways, suggesting that medicinal modification of STK11 or its phosphorylation targets may affect the AD-related phenotypes.
A second example is a region of high linkage disequilibrium (LD) on chromosome 5 (Figure 3), which is a major risk locus for inflammatory bowel disease (IBD). 20The most likely causal gene prioritized by multiple GWAS, based on its function and the presence of Selected rQTLs that include a ratio between a cis-and a trans-located protein, with the cis-protein being the target of an approved drug (Tclin according to Pharos 16 ).The negative log 10 -transformed p values for the association with the single proteins (ÀlogP.1,ÀlogP.2) and the ratio (ÀlogP.3)and the log 10 -transformed p-gain (logPgain) for the rQTL are reported; the GWAS traits were annotated using PhenoScanner 17 (details in Table S4).
an amino acid-changing variant, was macrophage stimulating 1 (MST1).However, this view has been challenged, proposing glutathione peroxidase 1 (GPX1) as a causal gene instead, supported by biochemical experiments showing that a cosegregating amino acid-changing variant in GPX1 reduced the activity of this antioxidant enzyme. 21Here, 14 ratios between 16 proteins were identified that associated with a significant p-gain at this locus (Table S8).Seven were ratios of pyruvate kinase, liver, and red blood cells (PKLR) with proteins involved in hemoglobin metabolism, including hydroxyacylglutathione hydrolase (HAGH), hydroxymethylbilane synthase (HMBS), arginase 1, and biliverdin reductase B. The biochemical properties of these proteins clearly support a causal role for GPX1 in an oxidative stress-related phenotype, likely related to hemoglobin metabolism in red blood cells.However, four of the ratios were with the cis-encoded protein dystroglycan 1 (DAG1) and several proteins not related to red blood cell metabolism, suggesting the presence of a second, likely independent, causal gene at this locus, which would cosegregate with the GPX1 variant due to the high LD in this region.Whether both pathways are driving factors of the IBD association requires further investigation.Important for the present study is that this case exemplifies the kind of insights that can be drawn from using rQTLs and their value for drug target evaluation and hypothesis generation.

Discovery of cis-pQTLs
Observation of cis-pQTLs is considered genetic evidence to confirm the target specificity of the respective affinity binding assay.Sun et al. 6 found a cis-pQTL for 1,163 (79.5%) of the 1,463 assayed proteins.Here, I report 39 additional genetic variants that associated with a ratio that involves a protein located in-cis and a second protein located in-trans (Table 2).These cis-pQTLs became presumably discoverable because the trans-proteins in the ratios captured some unidentified shared nongenetic variance, accounting for which led to the significant p-gains.The corresponding 39 proteins include three Olink targets for which no genetic signal had been found in the UKB PPP GWAS at all (ARHGEF12, EIF4EBP1, INPPL1) and thus provide genetic evidence that the respective antibodies bind their designated targets.Even by using only a subset of all of the possible ratios, I identified 13% of the 300 cis-pQTLs that were not accounted for so far, increasing confidence in the target specificity of the Olink platform for these proteins.More may be identified in an all-against-all ratio approach.

Refinement of the rQTL loci
For economic reasons, the GWAS was conducted using the genotyped variants only and therefore I may have missed variants of interest.For each of the 8,462 rQTLs, I therefore refined the associations within ±500,000 bp of the respective lead variant by using the imputed UKB genotype data, both in the discovery and in the replication cohort.The summary statistics are provided for all 8,462 refined regions on FigShare (Data S1, https://doi.org/10.6084/m9.figshare.23695398).These data can also be used to further refine the loci of interest-for instance, to identify potentially multiple independent signals using SuSiE 23 or to test for colocalization with other traits of interest using coloc. 24To visualize individual rQTLs regional association plots were generated for all rQTLs, both in the discovery and the replication cohort (Figure S3).I then used coloc 24 to ask whether the two proteins in a ratio shared the same genetic signal (Q.12), whether any of the two proteins shared a signal with the ratio (Q.13 and Q.23), and whether the signal for the ratio was shared between discovery and the replication cohort (Q.33.repli).Table S6 provides the most likely hypothesis for each of these four questions, together with its posterior probability.In 7,414 (87.6%) of the 8,462 cases, at least one of the proteins shared a genetic signal with the ratio (Q.13 = H4 or Q.23 = H4), in 1,305 (15.4%) cases both proteins shared a signal with the ratio (Q.13 = H4 and Q.23 = H4), and in 489 (5.8%) cases there was no signal detectable for either of the two proteins alone (Q.13 = H2 and Q.23 = H2 and Q.12 = H0).A total of 6,775 of the 8,462 rQTLs (80.1%) shared a genetic signal between discovery and the replication cohort (Q.33.repli = H4).
For each rQTL region, the variant with the strongest association with the ratio in the discovery cohort was designated as the lead variant, and I asked whether the association on this variant replicated.Requiring in the discovery cohort p < 5 3 10 À8 /2,821 and p-gain >10 3 10 6 3 2,821 and in the replication cohort p <0.05/8,462 and p-gain >10 3 8,462, I identified 4,181 rQTLs (49.4%) that satisfied this stringent Bonferroni significance criterion.Considering that 80.1% of the rQTLs shared the same genetic signal between discovery and the replication cohort, it is likely that more rQTLs can be replicated when more samples become available.

Why do ratios work and what do they represent?
With P 1 and P 2 representing the levels of two blood circulating proteins (the indices of the individual samples were suppressed) two linear models can be fit to the log-scaled protein levels by selecting parameters a i , b i , and g i such that they minimize the square of the nonexplained variance ε i in the following equation: SNP represents the number of effect alleles (0, 1, 2) of a given genetic variant in a given sample and W denotes some noni-dentified nongenetic variance that is shared by both proteins.Using the identity log(A/B) = log(A) -log(B), the ratio can then be written as: Because the significance level (p-value) of the association with the variant depends on the proportion of the variance that is explained by the genetic term (SNP) compared to the remaining variance (W + ε), the strength of the association with the ratio can increase under two conditions: (1) when b 1 and b 2 have opposite signs, or (2) when g 1 and g 2 are of comparable size and nonzero.b 1 and b 2 having opposite signs implies that the genetic variant increases the levels of one protein while decreasing those of the other (Figure 4).When working with metabolites, this situation can occur when the ratio represents a substrate-product pair of an enzyme whose efficacy is affected by the genetic variant.Many such cases have been reported. 12,13,25For proteins, a possible scenario is a genetic variant that increases the expression of one protein that acts as a suppressor of a second protein.
One example is the association of rs1065853, which is in LD with coding SNP rs7412 in APOE, and the ratio between low-density lipoprotein receptor (LDLR) and proprotein convertase subtilisin/ kexin type-9 (PCSK9) protein levels (log 10 (p-gain) = 67.2).PCSK9 binds LDLR and targets it for degradation. 26The availability of PCSK9 to degrade LDLR in turn is limited by binding to apolipoprotein B, 27 the levels of which are associated with rs7412 in APOE.Plot created using LocusZoom. 22f only one of the proteins is affected by the genetic variant, then observing a significant p-gain implies that g 1 and g 2 must be of comparable size and nonzero, and the association with the ratio indicates the presence of some nongenetic variance that is shared by both proteins.For instance, ITGB1BP2 has only five trans-pQTLs in the UKB PPP GWAS of moderate effect size, but occurs with 26 different ratios in 259 rQTLs in the present study's GWAS, the strongest with a log10(p-gain) = 1,115.5 for the association of rs4680 with the ratio catechol-Omethyltransferase (COMT) divided by ITGB1BP2 at the COMT gene locus.The association of ITGB1BP2 with rs4680 was not significant (p = 0.69).The ratio with the largest number of rQTLs List of 39 genetic variants that associated with a ratio that involves a protein located less than 1 MB from the variant (cis-protein) and that has no cis-pQTL in the UKB PPP GWAS.The negative log 10 -transformed p values for the association with the single proteins (ÀlogP.1,ÀlogP.2) and the ratio (ÀlogP.3)and the log 10 -transformed p-gain (logPgain) for the rQTL are reported (details in Table S6).
a This protein has no pQTL in the UKB PPP GWAS.
in the GWAS was ITGB1BP2/MITD1, which had 24 rQTLs compared to only 2 pQTLs for MITD1 in the UKB PPP GWAS (Figure 5).Intriguingly, both proteins are highly correlated (Pearson r 2 = 0.86), suggesting that their correlation is driven by some shared but not identified factor.The strongest correlations with one of the clinical biochemistry and blood traits available in UKB was with platelet count (r 2 = 0.12 with ITGB1BP2, r 2 = 0.10 with MITD1, and r 2 = 0.025 with the ratio), which are too weak to explain the full correlation between both proteins, suggesting that a driving factor for this association is related to some more specific, probably blood cell-type-related trait that is not readily available in the UKB phenotype dataset.
There are multiple possible causes that can lead to a significant p-gain in a ratio association, as schematized in Figure 4; some rQTLs reveal the presence of shared genetic variance, whereas others suggest the proteins in the ratio being linked through some shared nongenetic processes.It is interesting to note in this context that adjusting for hidden factors by using genomic or proteomic surrogates can also lead to a substantial improvement in pQTL detection. 28at can be learned from rQTLs?To evaluate the enrichment in protein pairs that were linked through significant ratio associations and/or GGM edges, the STRING database of protein-protein interactions was used (Table S5).Of the 2,281 protein pairs, 168 pairs (6.0%) had a protein-protein interaction link in the STRING database with a high confidence score (>0.7), and random pairs between these proteins had on average only 22.1 links (SD = 4.2, based on 100 samplings), which corresponds to a 7.6-fold enrichment.For comparison, of the 11,936 protein pairs linked through significant GGM edges, 465 pairs (3.9%) had a protein-protein interaction reported in STRING, whereas random sampling yielded an average of 89.4 (SD = 9.4), which corresponds to a 5.2-fold enrichment for proteins linked through GGM edges alone.
Using cytokine and cytokine-receptor annotations from CytokineLink, 29 39 protein pairs were identified with an rQTL that involved cytokine-cytokine pairs, 15 receptor-receptor pairs, and 9 cytokine-receptor pairs, 7 of which were known Possible scenarios that can lead to a significant p-gain in a ratio association P1 and P2 are the proteins in the ratio that associates with the genetic variant SNP, X is the causal cis-encoded protein in the case of a trans-rQTLs, and W denotes some unidentified shared nongenetic variance.and 2 were new (colony-stimulating factor 1:lymphotoxin b receptor and C-X-C motif chemokine ligand 9 [CXCL9]:tumor necrosis factor receptor superfamily member 9 [TNFRSF9]).CytokineLink predicted 1,542 cytokine-cytokine interactions between 77 cytokines from the Olink platform (out of 77 3 76/2 = 2,926 possible interactions).The number of cytokine pairs that were involved in rQTLs was significantly enriched (30 out of 39 compared to 1,512 out of 2,887 with no rQTL, p < 0.002, Fisher exact test).
When multiple proteins associate in rQTLs with the same variant, networks of related proteins can be constructed.Here, an example is presented of how a single pQTL around a pharmaceutically interesting protein can be extended into a network of potentially interacting proteins.NFATC1 (nuclear factor of activated T cells 1) is a key transcription factor and regulator of the immune response 30 and a molecular target for immunosuppressive drugs such as cyclosporin A. 31 NFATC1 has been implicated in the pathogenesis and targeted therapy of hematological malignancies. 32,33rs657693 is a cis-pQTL for NFATC1 in the UKB PPP GWAS and one of only two genetic associations for this protein.rs657693 is identified here as an rQTL for the ratio of NFATC1 with 16 other proteins (AXIN1, BACH1, BANK1, BCR, CASP2, CD69, EIF4G1, FADD, FOXO1, IKBKG, INPPL1, IRAK1, LBR, PTPN6, SPRY2, TJAP1; Table S4), none of which had a significant association with this variant alone.Of the 16 proteins 9 had a second replicated rQTL in a ratio with NFATC1 elsewhere in the genome, and in all of these cases, the other protein in the ratio was the driving pQTL, with 5 of them being cis-pQTLs (AXIN1, BANK1, FOXO1, SPRY2, TJAP1).GWAS identified additional rQTLs, including cis-rQTLs for BCR (rs713617), cluster of differentiation 69 (CD69) (rs7309767), and Fas-associated protein with death domain (FADD) (rs7939734).Ingenuity Pathway Analysis identified a number of functional links between these proteins, generating a network of proteins that can now be linked through genetic evidence and rQTLs to the NFATC1 locus, potentially supporting the development of new immunosuppressive drugs (Figure S4).For instance, it has been shown that silencing NFATC1 results in the phosphorylation of forkhead box O1 (FOXO1) and thereby plays a role in cell differentiation. 34Expression of NFATC1 and nine of the proteins in its ratio (AXIN1, BCR, CASP2, CD69, EIF4G1, FADD, IRAK1, LBR, PTPN6) is enriched in leukemia cells (false discovery rate = 0.003). 35Interleukin 1 receptor associated kinase 1 (IRAK1) is an emerging therapeutic target in hematologic malignancies, and it has been suggested that IRAKs participate in regulatory interactions with FADD. 36Indeed, FADD has been shown to physically interact with IRAK1, 37 lending further experimental support to this rQTL-derived network and its role in T cell development.
To further explore the benefits of all-against-all ratios, the associations of SNP rs12075 were computed with all possible ratios between 76 cytokines that are in the Olink panel.rs12075 (1:159175354:G:A) is an amino acid changing variant (c.125G>A, p.Gly42Asp) in the atypical chemokine receptor 1 gene (ACKR1 aka DARC).The glycine variant defines the Fy a allele and the aspartate variant defines the Fy b allele of the Duffy blood group system. 38DARC is clinically important because it is the entry point for the human malaria parasite Plasmodium vivax.Individuals with two copies of the Fy a allele or a silenced Fy b allele are resistant to P. vivax infection.A structural basis for DARC binding to P. vivax's Duffy-binding protein involving the region around the p.Gly42Asp variant has been proposed. 39leven cytokines are associated in the UKB PPP GWAS at the ACKR1 locus (CCL2 [C-C motif chemokine ligand 2], CCL7, CCL8, CCL11, CCL13, CCL14, CCL17, CCL26, CXCL1, CXCL6, CXCL8; Table S3).CCL7 and CCL8 protein levels increased with the copy number of the Fy a allele, whereas levels of the other nine cytokines decreased with that variant.DARC controls chemokine levels through promiscuous binding. 40The associations with these cytokines thus match the function of DARC.In the discovery study, rs12075 associated with three ra-tios (CCL13/CCL8, CCL2/CCL7, and CCL11/CCL7; Table S4), and the strongest association of rs12075 was with the ratio between CCL8 and CCL13.An interaction between these two cytokines has been experimentally established by bidirectional immunoligand blotting. 41Testing the association of rs12075 with all possible ratios between the 76 cytokines on the Olink panel implicated 12 additional cytokines (CCL3, CCL4, CCL14, CXCL11, CXCL12, HGF, IL-7, PDGFA, TGFB1, THPO, TNFSF13, TNFSF14) in significant (p-gain > 10 10 ) rQTLs (Figure S5; Table S9).It goes beyond the scope of the present study to interpret these associations in further detail.The takeaway message here is that using ratios not only identified additional cytokines that associate with the Duffy blood type but they also suggest interactions between specific pairs of proteins, such as CXCL6, which occurs in a significant ratio with CCL8, but not with CCL7.Especially at pleiotropic loci, where multiple proteins associate with a clinically relevant variant, it may be worthwhile to conduct this kind of all-against-all ratio analysis, using a subset of functionally related protein, as done in this example, or even extending the ratio analysis to the full protein panel.

DISCUSSION
In this study, a GWAS at scale is presented with ratios between blood circulating protein levels, using the recently released Olink proteomics data for almost 1,500 proteins measured in the blood plasma of 54,000 participants of the UKB.Using ratios, increases in the strengths of association by up to several hundred orders of magnitude were observed, involving two-thirds of the proteins targeted by the Olink platform, increasing the strength of association at 16% of the pQTLs from the UKB PPP GWAS.Novel cis-pQTLs were reported for 13% of the 300 Olink proteins for which such a target-confirmatory QTL had not been identified so far and uncovered over 2,500 novel QTLs with ratios at loci that had not been highlighted by the UKB PPP GWAS using single protein levels, which corresponds to a 25% increase in the discovery rate.
I argue that ratios can account for unidentified genetic and/or nongenetic variance that is shared between the associated protein pairs (Figure 4).rQTL protein pairs were 7.2-fold enriched in known protein-protein interactions, demonstrating that they add substantial new information to hypothesis generation and providing a broad set of protein-protein relationships that can be mined using network pharmacology (the NFATC1 example) 42 and systems immunology (the ACKR1 example) 43,44 approaches.I further reported selected examples that illustrate insights gained from using ratios, adding new information to established loci (the GPX1 with IBD example) and identifying entirely novel loci (the STK11 with AD example).
Because I could only discuss a fraction of the biologically relevant findings in this paper, I freely share the full summary statistics of the GWAS using array-genotyped data and of the refinement using imputed genotype data, together with the corresponding Manhattan and regional association plots.These data represent a rich resource for biomedical hypothesis generation that complements the data generated by the UKB PPP GWAS and should be of particular value for pharmaceutical drug target development.
I hope to have demonstrated the benefits of analyzing ratios between protein levels at scale, an approach that I believe has already shown its benefits in the metabolomics field.Future work is needed to further speed up the calculation of ratio associations, especially in light of the broadening proteomics panels and their increased application in large-scale cohorts.Also, theoretical development and generalization of the concept of ratios to more general classes of multivariate association tests may further improve the power for the detection of genetic signals.

Limitations of the study
The following caveats apply: I analyzed only a subset of all of the possible ratios.Although the likelihood of finding a significant ratio association for a protein pair increased with the strength of their partial correlation, even for uncorrelated proteins, this likelihood remains high (5%), supporting the testing of all possible ratios in future GWAS, if resources permit.Indeed, I hope that the present study motivates further methods development that could render all-against-all ratio testing computationally feasible, and perhaps also more research into formal statistical methods that may generalize the analysis of combinations of quantitative traits as dependent variables in GWAS.
As for all GWAS, this study is hypothesis generating by nature.An rQTL suggests a potential relationship between the two pro-teins in the ratios, plus potentially a third protein encoded at the genetic locus in the case of trans-rQTLs.These relationships can be of different natures, ranging from a direct physical interaction over indirect regulatory interactions to broad shared nongenetic factors.Also, proteins may have multiple functions and can be associated with multiple pathways, which needs to be considered in the interpretation of the rQTLs on a one-by-one basis.I also acknowledge that there is presently no established causal inference framework, which somewhat limits the interpretation of the findings from a Mendelian randomization perspective.
It should also be noted that affinity proteomics technologies have numerous limitations, such as effects of epitope-changing variants, nonspecific binding, and uncertainty about target specificity.However, the many biologically relevant associations that have been derived using data from the Olink and other affinity proteomics platforms suggest that these concerns are of minor relevance.The fact that 39 novel cis-pQTLs were identified in this study provides further confirmatory evidence for the target specificity of their respective affinity binders.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

Figure 2 .
Figure 2. Two-dimensional Manhattan plots (A) The position of the rQTL plotted against the position of the genes coding for the two proteins in the ratio, the stronger of the two single protein associations is in blue, the weaker in red.(B) The positions of the genes coding for the two proteins in an rQTL ratio plotted against each other; darker colors indicate multiple rQTLs with the same ratio.

Figure 3 .
Figure 3. Regional association plot for the association of the HAGH/PKLR ratio at a major IBD locus on Chr3

Figure 4 .
Figure 4. Possible scenarios that can lead to a significant p-gain in a ratio association

Table 1 .
rQTLs that implicate drug targets in the cis-position
d RESOURCE AVAILABILITY B Lead contact B Materials availability B Data and code availability d EXPERIMENTAL MODEL AND SUBJECT DETAILS d METHOD DETAILS d QUANTIFICATION AND STATISTICAL ANALYSIS d ADDITIONAL RESOURCES SUPPLEMENTAL INFORMATION